eIF4A1-dependent mRNAs employ purine-rich 5’UTR sequences to activate localised eIF4A1-unwinding through eIF4A1-multimerisation to facilitate translation

Abstract
Altered eIF4A1 activity promotes translation of highly structured, eIF4A1-dependent oncogene mRNAs at the root of oncogenic translational programmes. It remains unclear how these mRNAs recruit and activate eIF4A1 unwinding specifically to facilitate their preferential translation. Here, we show that single-stranded RNA sequence motifs specifically activate eIF4A1 unwinding, allowing local RNA structural rearrangement and translation of eIF4A1-dependent mRNAs in cells. Our data demonstrate that eIF4A1-dependent mRNAs contain AG-rich motifs within their 5’UTR which specifically activate eIF4A1 unwinding of local RNA structure to facilitate translation. This mode of eIF4A1 regulation is used by mRNAs encoding components of mTORC-signalling and cell cycle progression, and renders these mRNAs particularly sensitive to eIF4A1-inhibition. Mechanistically, we show that binding of eIF4A1 to AG-rich sequences leads to multimerisation of eIF4A1, with eIF4A1 subunits performing distinct enzymatic activities. Our structural data suggest that RNA-binding of multimeric eIF4A1 induces conformational changes in the RNA, resulting in an optimal positioning of eIF4A1 proximal to the RNA duplex that enables efficient unwinding. Our data suggest a model in which AG-motifs in the 5’UTR of eIF4A1-dependent mRNAs specifically activate eIF4A1, enabling assembly of the helicase-competent multimeric eIF4A1 complex and positioning these complexes proximal to stable localised RNA structure, allowing ribosomal subunit scanning.


INTRODUCTION
Dysregulation of cellular translation is a prominent feature of many cancers, supporting proliferative gene signatures and establishing oncogenic programmes initiated through signalling pathways including KRAS and mTORC (1,2). Downstream of these pathways operates a key eukaryotic translation initiation factor (eIF), the eIF4F complex, the activity of which links oncogenic signalling to oncogenic protein synthesis (3–5). eIF4F consists of the cap-binding protein eIF4E, the scaffold protein eIF4G and the ATP-dependent DEAD-box RNA helicase eIF4A1, which displays ATPase-dependent RNA strand separation activity.
By virtue of the eIF4F-complex, eIF4A1 catalyses at least two major steps in translation: mRNA loading onto the 43S pre-initiation complex (PIC) and its translocation along the 5'UTR to the translation start site (6–10). Interestingly, the loading function requires only eIF4A1's ATPase activity (6,8), while unwinding is additionally critical for efficient translation of mRNAs with highly structured 5'UTRs, which are hence considered highly eIF4A1-dependent and include mRNAs of oncogenes such as MYC and BCL2 (4,5,11,12). A variety of approaches have been aimed at identifying and characterising eIF4A1's cellular mRNA targets, which have been shown to have longer and more C/GC-rich 5'UTRs, thus containing more RNA secondary structure (5,11–13). Yet, it is still unresolved whether such highly structured eIF4A1-dependent mRNAs recruit and activate eIF4A1 unwinding specifically. Nevertheless, selective inhibition of eIF4A using a variety of natural compounds, including silvestrol, hippuristanol, pateamine A and elatol, has consistently demonstrated anti-tumour activity through downregulation of eIF4A1-dependent genes (5,14–16).
eIF4A1 binds single-stranded RNA in an ATP-dependent manner (17) and ATP-hydrolysis guides the protein through a conformational cycle, providing a model for how ATP-turnover and single-stranded RNA-binding are coupled (18–21). However, while it is understood that eIF4A1 unwinds duplex regions within RNAs, eIF4A1 appears not to associate with dsRNA in a detectable manner (22,23); hence it remains unclear how exactly the strand separation step of the duplex region is realised during the ATPase-driven conformational cycle. Moreover, eIF4A1 is a weak helicase by itself, but its unwinding efficiency is strongly stimulated in the presence of the cofactors eIF4G, eIF4B and eIF4H. This is achieved by complex formation between eIF4A1 and the cofactor proteins, which synergistically modulate eIF4A1's conformational cycle (18–22,24–26). Each cofactor is believed to operate at a different step of the cycle, but since structural information is lacking, it is unclear how multiple cofactors bind and synergise during a single catalytic cycle. Moreover, the exact role of eIF4A1-cofactors in orchestrating eIF4A1's function in mRNA-loading and unwinding in the translation of specifically eIF4A1-dependent mRNAs is unclear.
Being an essential translation initiation factor, eIF4A1 is considered to bind and load all mRNAs onto ribosomes regardless of RNA sequence and structure (6,8). However, more recent evidence suggests that the RNA itself influences eIF4A1 function: (i) the length of a single-stranded RNA substrate has been shown to enhance the catalytic activities of the yeast eIF4A-eIF4B-eIF4G complex, although this study did not dissect the impact of RNA sequence on the activity (27); (ii) members of the eIF4A protein-family preferentially bind to distinct mRNA sets (23,28,29); and (iii) rocaglamide-compounds induce translational repression by clamping eIF4A1 sequence-specifically onto AG-repeats (30). Despite this, the role of the RNA substrate itself in regulating eIF4A1 function has not been investigated in detail. We still do not know exactly how different RNA sequences interact with eIF4A1 and its cofactors and how this impacts eIF4A1's function in translation initiation. Therefore, we set out to investigate the central question of whether RNA sequences regulate eIF4A1 activity and function.
Here, we show that while eIF4A1 activity is governed by the length of a single-stranded RNA stretch, the major determinant for activation is the nucleotide sequence of the single-stranded region. We find that eIF4A1 interacts with RNA single-strands in a sequence-dependent manner involving a process in which eIF4A1 multimerises particularly on AG-rich RNA sequences. Our data show that eIF4A1-multimerisation stimulates site-directed unwinding of local RNA structure to specifically facilitate translation of otherwise repressed mRNAs. mRNAs that use this mechanism of eIF4A1 regulation encode components of cell cycle regulation and mTORC-signalling. Our model of eIF4A1 regulation by single-stranded RNA sequences is supported by (a) in vitro experiments demonstrating that eIF4A1 performs RNA sequence-specific activities that are most stimulated by AG-repeat sequences, (b) a transcriptome-wide analysis revealing that mRNAs containing AG-repeat motifs in their 5'UTRs show pronounced gain of RNA structure in their 5'UTR and display strongly reduced translation rates following inhibition of eIF4A1 with hippuristanol and (c) a mechanistic investigation showing that eIF4A1 multimerises upon binding to AG-rich single-stranded RNA sequences, directly loading eIF4A1 onto proximal RNA structures and thus activating unwinding. Altogether, our data demonstrate that AG-RNA sequences regulate eIF4A1 function to drive translation of eIF4A1-dependent mRNAs with localised repressive RNA structures, including mRNAs critical for cell cycle progression.

Cell lines
HeLa and MCF7 cells were purchased from ATCC for this study and were already authenticated. For HeLa cells, in-house authentication using Promega GenePrint 10 was also performed and confirmed HeLa identity. MCF7 cells were additionally authenticated by Eurofins using PCR-single-locus-technology. All cell lines were tested for mycoplasma every two weeks. All tests were negative and confirmed the absence of mycoplasma contamination.

Cell culture and transfection for FLIM experiments
HeLa cells were seeded at a density of ∼120 000 cells per dish (35 mm sterile MatTek, glass bottom) in DMEM (Gibco) supplemented with 10% FBS (Gibco) and 2 mM final concentration of L-glutamine (Gibco). Cells were transfected with 1 µg of plasmid, or 1 µg each in case of co-transfections, using GeneJammer (Agilent) at a reagent:plasmid ratio of 3:1. At 48 h post transfection, medium was exchanged for DMEM (Gibco) supplemented with 10% FBS (Gibco) and 2 mM final concentration of L-glutamine (Gibco) and cell dishes were taken for FLIM measurements.

Biomass production for generation of recombinant proteins
All proteins were heterologously produced in E. coli BL21 (DE3) CodonPlus-RP as N-terminal 6xHis-SUMO-fusion proteins, following procedures as reported in our previous work (23). Except for eIF4G, recombinant proteins were produced applying standard protocols for IPTG-induction. Briefly, main cultures were inoculated from overnight precultures. Main cultures were then grown to OD600 = 0.8-1 before protein production was induced with a final concentration of 1 mM IPTG. Cells were harvested 4 h post induction. For eIF4G, cells were first cultivated at 37 °C to an OD600 = 0.6-1 before cells were cooled down to 20 °C and protein production was induced with IPTG for 16 h. Cells were harvested by centrifugation and stored at −80 °C.

Protein purification
Recombinant proteins were purified following procedures as reported in our previous work (23). Cells were resuspended and lysed in buffer A [20 mM Tris/HCl, pH 7.5, 1 M NaCl, 30 mM imidazole and 10% (v/v) glycerol] supplemented with 1 mM PMSF and complete EDTA-free protease inhibitor cocktail (Roche). After centrifugation at 45 000 × g, the supernatant was filtered (0.45 µm) and applied to HisTrap (GE Healthcare) affinity chromatography. Bound protein was eluted with a linear imidazole gradient. Pooled fractions were diluted in buffer B [20 mM Tris/HCl, pH 7.5, 10% (v/v) glycerol, 0.1 mM EDTA, 2 mM DTT] and incubated with SUMO-protease overnight at 4 °C for cleavage of the SUMO-tag. The protein solutions were further diluted with buffer B; eIF4A1 fractions were subjected to a ResourceQ (GE Healthcare) anion exchange column, and eIF4G-MC and eIF4H fractions were subjected to a Heparin (GE Healthcare) affinity column. Bound protein was eluted with a linear KCl gradient from 100 to 1000 mM KCl. Pooled fractions were further purified by size exclusion chromatography using a Superdex 200 column equilibrated in storage buffer [20 mM Tris/HCl, pH 7.5, 100 mM KCl, 0.1 mM EDTA, 1% (v/v) glycerol, 1 mM TCEP]. Pooled fractions were concentrated, flash-frozen in liquid nitrogen and stored at −80 °C. Protein concentrations were calculated from the absorbance at 280 nm (A280) using extinction coefficients obtained from the ExPASy server (Supplementary Table S7). All protein preparations showed an A280/A260 ratio ≥1.8; for eIF4H the ratio was ≥1.5, indicating negligible amounts of contamination by nucleic acids and nucleotides.

Ribooligonucleotides
RNAs used in this study were purchased from IBA Lifescience and Integrated DNA Technology and are listed in Supplementary Table S8.

[...] were pre-incubated with all components except eIF4A1 for 10 min. Data were normalised using the respective total signal change per condition.

Fluorescence-based RNA-binding
For RNA-release experiments, protein-RNA complexes were formed by incubation of 50 nM FAM-labelled RNA with 5 µM protein in AB ± 100 µM silvestrol + 2 mM ATP in the absence of magnesium. Binding and ATPase-dependent RNA release were initiated by addition of magnesium chloride to a final concentration of 2 mM.
For FRET-based RNA-binding (Figure 5E and Supplementary Figure S6L), 50 nM Cy3-Cy5-labelled RNA duplex substrate was incubated alone or with 3 µM eIF4A1 in AB in the presence of 2 mM AMPPNP/MgCl2 for 60 min. Competitor AG-RNA was then added to scavenge excess eIF4A1 as indicated in the figures. Fluorescence-emission spectra in the range 540-800 nm were recorded by excitation at 520 nm. Spectra were corrected for Cy5-emission collected from reactions containing only the Cy5-labelled strand. Corrected spectra were then normalised to the maximum Cy3-fluorescence at 565 nm. Relative FRET was calculated according to the equation [...]
Fluorescence intensities and anisotropy were measured using a Victor X5 (Perkin Elmer) or Spark (Tecan). Dissociation constants and half-lives were obtained from fitting the experimental data to the Hill and single-exponential decay equations, respectively, using Prism GraphPad 7, 8 or 9.

Electrophoretic mobility shift RNA-binding
25 nM Dy680- or Dy780-labelled RNAs were incubated with indicated proteins in AB + 2 mM AMPPNP/MgCl2 in the presence and absence of 100 µM silvestrol or 50 µM hippuristanol in 10 µl reactions for 60 min at 25 °C.
In clamping experiments in Figure 3H-I, eIF4A1 was preincubated with RNA and silvestrol in AB + 2 mM MgCl2 in the absence of nucleotide for 60 min at 25 °C before competitor AG-RNA was added.
A final concentration of 2% (w/v) Ficoll-400 was added to the samples and complexes were separated on 6-7% acrylamide-TB gels at 100 V for 50 min at room temperature using 0.5× TB as running buffer. When binding of eIF4A1 to the unwinding substrate was analysed, gels were run at 4 °C. Gels were scanned immediately after the run with Odyssey (Licor) and band intensities quantified using Image Studio Lite. Dissociation constants were obtained from fitting the experimental data to the Hill equation using Prism GraphPad 7, 8 or 9.

Analytical gel filtration
eIF4A1 alone or with RNA was incubated for 1 h in AB supplemented with 2 mM AMPPNP/MgCl2 ± 100 µM silvestrol at room temperature at concentrations of 16 µM and 4 µM, respectively, if the protein was in excess; or at 4 and 12 µM, respectively, if the RNA was in excess. Samples were loaded onto a S200 Increase 3.2/300 column (2.4 ml) that was equilibrated in AB + 2 mM MgCl2 without AMPPNP. Ovalbumin (45 kDa) and Conalbumin (75 kDa) were used as molecular weight standards.

Analytical ultracentrifugation
All analytical ultracentrifugation experiments were performed at 50 000 rpm, using a Beckman Optima analytical ultracentrifuge with an An-50Ti rotor at 20 °C. Data were recorded using the absorbance optical detection system. For characterisation of the individual protein, sedimentation velocity (SV) scans were recorded at 280 nm in AB ± 2 mM AMPPNP/MgCl2 ± 100 µM silvestrol. For characterisation of the individual RNA samples Dy780-(AG)5 and 6-FAM-(AG)10, SV scans were recorded at 766 and 495 nm, respectively, in AB ± 2 mM AMPPNP/MgCl2 ± 100 µM silvestrol. For characterisation of the protein in complex with either Dy780-(AG)5 or 6-FAM-(AG)10, SV scans were recorded at 766 or 495 nm, respectively, in assay buffer ± 2 mM AMPPNP/MgCl2 ± 100 µM silvestrol.
The density and viscosity of the buffer were measured experimentally using a DMA 5000M densitometer equipped with a Lovis 200ME viscometer module. The partial specific volume of the protein was calculated using SEDFIT from the amino acid sequence. The partial specific volume of the RNA was calculated using NucProt from the nucleotide sequence. The partial specific volumes of RNA:protein complexes with different stoichiometries were calculated using the equation:
$\tilde{v} = \dfrac{M_P \tilde{V}_P + M_R \tilde{V}_R}{M_P + M_R}$
where $M_P$ and $\tilde{V}_P$ denote the molecular mass and partial specific volume of the protein, respectively, and $M_R$ and $\tilde{V}_R$ denote the molecular mass and partial specific volume of the RNA, respectively. Data were processed using SEDFIT, fitting to the c(s) model.
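The mass-weighted averaging of partial specific volumes can be illustrated with a minimal Python sketch. The function name, the `n_protein` stoichiometry argument and the numerical inputs are assumptions made for illustration; they are not values or code from this study.

```python
def complex_partial_specific_volume(m_protein, v_protein, m_rna, v_rna, n_protein=1):
    """Mass-weighted mean partial specific volume (ml/g) of an RNA:protein complex.

    n_protein is a hypothetical stoichiometry argument (protein copies per RNA);
    the protein mass entering the average is n_protein * m_protein.
    """
    protein_mass = n_protein * m_protein
    return (protein_mass * v_protein + m_rna * v_rna) / (protein_mass + m_rna)


# Illustrative inputs only (masses in Da, volumes in ml/g), not study values.
for n in (1, 2, 3):
    v = complex_partial_specific_volume(46_000, 0.73, 6_500, 0.55, n_protein=n)
    print(f"{n}:1 protein:RNA complex -> {v:.4f} ml/g")
```

As more protein copies are added, the value moves from the RNA value towards the protein value, which is why each assumed stoichiometry needs its own partial specific volume.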

Fluorescence lifetime imaging
Fluorescence lifetime measurements in live cells were conducted as described previously (33). Briefly, a Lambert Instruments fluorescence system attached to a Nikon Eclipse

Complex formation between eIF4A1 and cofactors
For experiments that included complexes between eIF4A1 and cofactors or combinations thereof, unless otherwise stated proteins have been preincubated in AB in the absence of RNA and nucleotides for at least 60 min before RNA was added.

Real-time fluorescence-based unwinding
For titrations, 50 nM annealed substrate was incubated with indicated proteins in AB in 18 µl reactions in the presence (clamping conditions) or absence (non-clamping conditions) of 100 µM silvestrol in 384-well plates and incubated for 1 h at 30 °C. Protein dilutions were prepared using storage buffer. Under scavenging conditions (Figure 3G-I), 2 µM or the indicated concentrations of AG-RNA were added after the pre-incubation step and allowed to scavenge excess eIF4A1 for another 60 min.
In pre-clamping experiments, i.e. when eIF4A1 wt or eIF4A1 DQAD was pre-bound to the RNA substrate before addition of the next protein (Figure 4B), 1 µM of the indicated eIF4A1 variant was incubated with the RNA substrate in the absence of nucleotide for 1 h before additional eIF4A1 was added to the reaction.
When fractional mixes of eIF4A1 wt and eIF4A1 DQAD were used, they were first premixed at 50 µM (10× stock) concentration in storage buffer before being added to the reaction mixtures.
Reactions were started by addition of ATP-MgCl2 to a final concentration of 2 mM and fluorescence readings were taken in an InfinitePro M200 (Tecan) or Spark (Tecan) with excitation at 535 nm and emission at 575 nm. Data were analysed as described previously (26,34). Data were fitted to a linear or single-exponential equation to yield the initial rate of unwinding or the total fraction unwound, respectively. Unless stated otherwise, secondary data were further fitted to the Hill equation using Prism (GraphPad 7, 8 or 9).
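For orientation, the general form of the Hill equation used for such secondary analyses can be written as a short Python function. This is a sketch of the textbook parameterisation; the exact model variant and parameter names used in Prism for this study are not stated in the text, so `y_max`, `k_half` and `n_hill` are assumed names.

```python
def hill(x, y_max, k_half, n_hill):
    """Textbook Hill equation: response at concentration x.

    y_max  : plateau (for example the maximal initial unwinding rate)
    k_half : concentration giving the half-maximal response
    n_hill : Hill coefficient describing cooperativity
    """
    if x < 0:
        raise ValueError("concentration must be non-negative")
    xn = x ** n_hill
    return y_max * xn / (k_half ** n_hill + xn)


# Illustrative titration with made-up parameters (not fitted study values).
for conc_uM in (0.5, 1.0, 2.0, 4.0, 8.0):
    print(conc_uM, round(hill(conc_uM, y_max=1.0, k_half=2.0, n_hill=3.0), 3))
```

A Hill coefficient above 1 gives the sigmoidal dependence on protein concentration that is typically read as cooperative behaviour.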

ATPase assay
ATPase reactions were carried out side-by-side from the same master mix as the fluorescence-based unwinding assays. In separate reactions, NADH (Sigma), phosphoenolpyruvate (Sigma or Alfa Aesar) and lactate dehydrogenase/pyruvate kinase mix (Sigma) were added to unwinding reactions to a final concentration of 2 mM, 2 mM and 1/250 (v/v), respectively. NADH turnover was monitored by measuring absorbance at 340 nm. Obtained absorbance data were converted to the concentration of NADH using a condition- and machine-specific ε(NADH) of 0.62 mM⁻¹. ATPase rates were obtained from a linear fit to the experimental data using Prism (GraphPad 7, 8 or 9).
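The conversion from absorbance to an ATPase rate can be sketched as follows. The sketch assumes the usual 1:1 coupling of the pyruvate kinase/lactate dehydrogenase system (one NADH oxidised per ATP hydrolysed) and uses the ε(NADH) of 0.62 mM⁻¹ quoted above; the function names and the example readings are illustrative, not study data.

```python
EPSILON_NADH_PER_MM = 0.62  # condition- and machine-specific value quoted in the text


def linear_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def atpase_rate_uM_per_min(times_min, a340):
    """ATP hydrolysis rate (µM/min) from the decline in NADH absorbance."""
    nadh_mM = [a / EPSILON_NADH_PER_MM for a in a340]
    slope_mM_per_min = linear_slope(times_min, nadh_mM)  # negative while NADH is consumed
    return -slope_mM_per_min * 1000.0


# Made-up readings: A340 falls by 0.062 per minute.
print(atpase_rate_uM_per_min([0, 1, 2, 3], [1.24, 1.178, 1.116, 1.054]))
```

Dividing the resulting rate by the enzyme concentration would give a turnover number, provided the reaction is in the linear range of the coupled system.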

Unwinding gel shift
All reactions were prepared from the same master mix and split accordingly for the following different conditions. 50 nM annealed substrate was incubated with 3 µM eIF4A1 in AB supplemented with 2 mM MgCl2 in 10 µl reactions in the presence or absence of 100 µM silvestrol and incubated for 1 h at room temperature. Under scavenging conditions, a final concentration of 2 µM AG-RNA was added after the preincubation step and allowed to scavenge excess eIF4A1 for another 60 min. Reactions were then started by addition of a final concentration of 2 mM ATP. Reactions were quenched after another 60 min with stop solution (0.5× TBE, 0.2% (w/v) SDS, 50 mM EDTA, pH 8), or, if RNA-bound complexes were to be resolved, only 2% (w/v) Ficoll-400 was added. Samples were subjected to gel electrophoresis on discontinuous 10%-acrylamide TB/18%-acrylamide-TBE gels. Gels were run at 200 V at 4 °C and immediately scanned using an Odyssey instrument (LICOR).

Small-angle X-ray scattering
Samples contained 100 µM eIF4A1 alone, 100 µM eIF4A1 with 30 µM of either AG-RNA or AG-overhang substrate to generate multimer eIF4A1-RNA complexes, or 60 µM and 100 µM eIF4A1 with 60 µM AG-RNA or 100 µM CAA-RNA, respectively, to generate monomer complexes in AB supplemented with 2 mM AMPPNP/MgCl2 and 100 µM silvestrol. Samples were kept at a concentration of approximately 5 mg/ml, frozen in liquid nitrogen, and shipped to Diamond Light Source on dry ice. The protein was applied to a Superdex 200 Increase 3.2 column, at 0.16 ml/min, before being exposed to the X-ray beam, as part of the standard set up at station B21. Data were analysed using ScÅtter version 3.2h. Seventeen ab initio models were calculated by DAMMIF (35), and average models of these were calculated using DAMAVER and DAMFILT (36). Reported resolution of the space-filled models was calculated using SASRES (37). Superpositions of ab initio models were calculated by SUPCOMB (38) or SITUS (39). Distances shown in Supplementary Table S5 are the mean ± SD based on four individual measurements using PyMOL2. Volumes are the results from data analysis using ScÅtter.

Reporter mRNA construction
Plasmids containing the desired cDNAs were constructed using annealed oligos (see Supplementary Table S9). For native 5'UTR reporters, gene-blocks of the sequences with flanking 5' HindIII and 3' NcoI restriction sites were purchased from IDT. All sequences were cloned into the pGL3-promoter plasmid (Promega E1761) between the HindIII and NcoI restriction sites, directly upstream of the FLuc open reading frame followed by a 3'UTR and an (A)49 sequence. Plasmids were linearised with NsiI located directly downstream of the (A)49 sequence and treated with Klenow fragment (NEB M0210S) to generate blunt-ends. RNA was then transcribed from the NsiI-linearised plasmids with the HiScribe™ T7 ARCA mRNA Kit (NEB E2065S) or HiScribe™ T7 mRNA Kit with CleanCap® Reagent AG (NEB E2080S) as per the manufacturer's instructions. RNAs were purified by acid-phenol chloroform extraction and ethanol precipitation with ammonium acetate. The concentration was quantified spectroscopically and RNA integrity was checked by formaldehyde denaturing agarose gel electrophoresis. RNA was stored at −80 °C.

In vitro translation assay
1 ml nuclease-untreated Rabbit Reticulocyte Lysate (Promega L4151) was supplemented with 25 µM haemin, 25 µg/ml creatine kinase, 3 mg/ml creatine phosphate, 50 µg/ml liver tRNAs and 3 mM glucose, aliquoted and stored at −80 °C. 50 ng Firefly-luciferase (FLuc) reporter constructs were mixed with storage buffer or storage buffer supplemented with recombinant 4E-BP1 (Sino Biological, 10022-H07E), eIF4A1 wt or eIF4A1 E183Q (eIF4A1 DQAD) at room temperature in a volume of 4 µl. Reactions were prepared in technical duplicates, incubated at 30 °C and luciferase activity monitored in real time for 1-2 h using a Tecan Spark plate reader. Readings from duplicates were averaged and the maximum translation was extracted as the maximum increase in firefly luciferase (FL) activity over time (slope) (40). FL activities are shown relative to the FL-activity of the CAA reporter (set to 1) per respective condition.

Reporter translation assay in MCF7 cells
MCF7 cells were seeded into a 96-well plate at a density of 10 000-20 000 cells per well in DMEM (Gibco) supplemented with 10% FBS (Gibco) and 2 mM final concentration of L-glutamine (Gibco) [hereafter DMEM] at least one day before the experiment. On the day of the experiment, medium was replaced by fresh DMEM supplemented with 150 nM hippuristanol or DMSO control. At 5 h of treatment, 150 ng capped reporter firefly luciferase (FL) mRNA and 30 ng HCV renilla luciferase (RL) mRNA were added to the medium and transfected using Lipofectamine 2000 (Invitrogen). At time points 1, 2 and 3 h post RNA transfection (hours 6, 7 and 8 of hippuristanol-treatment), medium of to-be-sampled wells (1 well per time point and condition) was aspirated and 35 µl 1× passive lysis buffer (Promega) was added to the wells. After 15 min incubation at 37 °C, lysed samples were transferred into new 96-well plates. FL and RL activities of 2 × 10 µl (technical duplicates) were measured with the Dual-Luciferase Assay system (Promega) in a GloMax (Promega) machine using manufacturer protocols. For each reporter individually, FL activities from the technical duplicates were normalised to RL levels per time point across conditions and plotted against the time points. Linear regression of the data was performed to yield the apparent translation rate k as the slope of the fitted line.
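The final step, obtaining an apparent translation rate k from RL-normalised FL readings, can be sketched in Python. The exact normalisation scheme "across conditions" is not spelled out in the text, so this sketch simply divides each FL reading by its paired RL reading and regresses all duplicates against time; the function names and numbers are assumptions for illustration.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def apparent_translation_rate(samples):
    """samples: (time_h, fl, rl) tuples, one per technical duplicate.

    Returns the slope k of the RL-normalised FL signal against time.
    """
    times = [t for t, _, _ in samples]
    ratios = [fl / rl for _, fl, rl in samples]
    slope, _ = linear_fit(times, ratios)
    return slope


# Made-up luminescence readings for one reporter under one condition.
dmso = [(1, 100, 10), (1, 120, 10), (2, 200, 10), (2, 240, 10), (3, 300, 10), (3, 360, 10)]
print(apparent_translation_rate(dmso))
```

Comparing k between hippuristanol and DMSO for the same reporter then gives a measure of how strongly that reporter depends on eIF4A activity.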

RNA structure-seq2 analysis
To assess changes in RNA structure surrounding polypurine-rich sequences, we interrogated our previously published Structure-seq2 data set from MCF7 cells (12), which measured changes in reactivity of RNA structure to dimethyl sulphate (DMS) upon specific inhibition of eIF4A with hippuristanol. DMS modifies non-base-paired As and Cs, hence DMS-reactivity is a measure of single-strandedness. The reactivity data are available at the Gene Expression Omnibus (GEO) database accession GSE134865, which can be found at https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE134865.
To identify all non-overlapping polypurine (R10) sequences in the data set, where R refers to a purine, we made use of the react_composition.py script from the StructureFold2 package of scripts (41), which is available from GitHub using the following link https://github.com/StructureFold2/StructureFold2. To exclude A10 and G10 motifs in our group of polypurine motifs, we only included 10 nt 100% R motifs in the analyses with a maximum of 7/10 being purely As or Gs, i.e. 100% R excluding motifs with more than 8 As or Gs. This script outputs the reactivity changes at all motifs and a user-defined size either side of the identified motif. Using these data we then filtered the output to the coverage and 5' end coverage thresholds used previously (12) and picked the most abundant transcript per gene with a 5'UTR length of more than 100 nt. All group sizes are summarised in Supplementary Table S1.
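The motif definition can be made concrete with a small stand-alone Python sketch. It is not the StructureFold2 implementation: it is a simplified regular-expression scan written for illustration, and it assumes the "maximum of 7/10" reading of the filter (at most seven As and at most seven Gs per 10-mer). The function name and the `max_single_base` parameter are hypothetical.

```python
import re

# Ten consecutive purines; finditer reports non-overlapping matches left to right.
PURINE_10MER = re.compile(r"[AG]{10}")


def find_r10_motifs(sequence, max_single_base=7):
    """Return (start, motif) for non-overlapping R10 motifs passing the A/G filter.

    Simplification: a 10-mer rejected by the filter is skipped entirely, so a
    valid motif overlapping a rejected window would be missed.
    """
    motifs = []
    for match in PURINE_10MER.finditer(sequence.upper()):
        motif = match.group()
        if max(motif.count("A"), motif.count("G")) <= max_single_base:
            motifs.append((match.start(), motif))
    return motifs


print(find_r10_motifs("CCAGAGAGAGAGCC"))  # one (AG)5 motif starting at index 2
print(find_r10_motifs("CCAAAAAAAAAACC"))  # A10 is excluded
```

A scan of this kind only locates candidate motifs; the coverage thresholds and transcript selection described above would still need to be applied to the reactivity data.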
For the plots in Figure 2H and Supplementary Figure S2H, only those motifs that were positioned at least 50 nt from a UTR/CDS boundary or the 5' or 3' end of the transcript were included. This identified 608 R10 motifs in the 5'UTRs of 358 transcripts, 6927 R10 motifs in the CDSs of 1906 transcripts and 2761 R10 motifs in the 3'UTRs of 1303 transcripts. Random motifs were selected using a sliding window analysis (same constraints as the R10 analysis, 20 nt windows with 10 nt steps) using the react_windows.py script from StructureFold2. The same number of random motifs as R10 motifs were selected from each transcript.
The minimum free energy (MFE, Supplementary Figure S2J) of predicted folds was calculated by folding the 50 nt windows shown in Figure 2H centred on the 31:50 downstream window directly downstream of all R10 or random motifs using the batch_fold_RNA.py script, which uses RNAstructure (version 6.1) (42), and extracting the metrics with the structure_statistics.py script from the StructureFold2 package.
All panels were created using the custom R scripts R10_analysis_1.R and R10_analysis_2.R, which are available at GitHub using the following link https://github.com/Bushell-lab/Structure-seq2-with-hippuristanol-treatment-in-MCF7-cells. The box plot shows the median (centre line), the upper and lower quartile (box limits), the 1.5× interquartile range (whiskers) and in Supplementary Figure S2J the mean (dot). Outliers (>1.5× interquartile range) are not shown.

Calculation of total cellular mRNA concentration
The concentration of mRNAs for a typical HeLa cell was calculated assuming a cell volume of 2425 µm³ (BNID: 103725 (43) and (44)) and an average mRNA copy number of 300 000 per cell (BNID: 104330 (43) and (45)).
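The arithmetic behind this estimate can be reproduced in a few lines of Python. The function name is illustrative; the only inputs are the cell volume and copy number stated above, Avogadro's constant and the unit conversion 1 µm³ = 10⁻¹⁵ l.

```python
AVOGADRO = 6.02214076e23  # molecules per mole
LITRES_PER_CUBIC_MICROMETRE = 1e-15


def molar_concentration(copies_per_cell, cell_volume_um3):
    """Molar concentration (mol/l) of a species given copies per cell and cell volume."""
    moles = copies_per_cell / AVOGADRO
    litres = cell_volume_um3 * LITRES_PER_CUBIC_MICROMETRE
    return moles / litres


conc = molar_concentration(300_000, 2425)
print(f"total mRNA concentration: {conc * 1e9:.0f} nM")
```

With these inputs the total cellular mRNA concentration comes out at roughly 0.2 µM.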

TMT-pulsed SILAC
MCF7 cells were cultivated in SILAC DMEM (Silantes) supplemented with 10% dialysed FBS (Sigma), 2 mM L-glutamine (Gibco), 0.789 mM Lys-¹²C₆¹⁴N₂ (Lys0) and 0.398 mM Arg-¹²C₆¹⁴N₄ (Arg0), referred to as light-DMEM, for at least five doubling times. All isotope-labelled amino acids were purchased from Cambridge Isotope Laboratories with an isotope purity >99%. For metabolic pulse-labelling, cells were then split using light-DMEM and allowed to settle overnight, followed by treatment on the next day with either 150 nM hippuristanol (0.8% DMSO stock) or DMSO control for eight hours in SILAC DMEM (Silantes) supplemented with 10% dialysed FBS (Sigma), 2 mM L-glutamine (Gibco), 0.789 mM Lys-¹³C₆¹⁵N₂ (Lys8) and 0.398 mM Arg-¹³C₆¹⁵N₄ (Arg10), referred to as heavy-DMEM. Samples were taken immediately at the beginning of treatment (time = 0 h) and two, four and eight hours after medium swap and treatment. Cells were harvested, washed in PBS and lysed in 6 M urea, 2 M thiourea, 50 mM Tris/HCl pH 8.5, 75 mM NaCl using sonication, and cleared by centrifugation. Supernatants were stored at −80 °C. For all time points a biological quadruplicate was generated before submission to MS. 25 µg protein lysate was reduced with 5 mM DTT, then alkylated in the dark with 50 mM IAA. Samples were then subjected to a two-step digestion, firstly with Endoproteinase Lys-C (ratio 1:33 enzyme:lysate) (Promega) for 1 h at room temperature, then with trypsin (ratio 1:33 enzyme:lysate) (Promega) overnight at 37 °C. Once digested, peptide samples were labelled with the TMT 16plex reagent kit (Thermo Scientific). 400 µg digested sample was fractionated using reverse phase chromatography at pH 10. Solvents A (98% water, 2% ACN) and B (90% ACN, 10% water) were adjusted to pH 10 using ammonium hydroxide. Samples were run on an Agilent 1260 Infinity II HPLC. Samples were manually injected using a Rheodyne valve. Once injected, the samples were subjected to a two-step gradient, 2-28% Solvent B in 39 min, then 28-46% Solvent B in 13 min. The column was washed for 8 min at 100% Solvent B followed by a re-equilibration for 7 min. Total run time was 76 min and the flow rate was set to 200 µl/min. The samples were collected into 21 fractions.
Peptide samples were run on a Thermo Scientific Orbitrap Lumos mass spectrometer coupled to an EASY-nLC II 1200 chromatography system (Agilent). Samples were loaded onto a 50 cm fused silica emitter (packed in-house with ReproSIL-Pur C18-AQ, 1.9 µm resin), which was heated to 55 °C using a column oven (Sonation). Peptides were eluted at a flow rate of 300 nl/min over three optimised two-step gradient methods for fractions 1-7, 8-15 and 16-21. Step one was run for 75 min and step two for 25 min. For fractions 1-7, the percentage of solvent B was 3-18% at step one and 30% at step two. For fractions 8-15, the percentage of B was 5-24% at step one and 38% at step two, and for fractions 16-21, the percentage of B was 7-30% at step one and 47% at step two. Peptides were electrosprayed into the mass spectrometer using a nanoelectrospray ion source (Thermo Scientific). An Active Background Ion Reduction Device (ABIRD, ESI Source Solutions) was used to decrease air contaminants.
Data were acquired using Xcalibur software (Thermo Scientific) in positive mode utilising data-dependent acquisition. Full scan mass (MS1) range was set to 350-1400 m/z at 120 000 resolution. Injection time was set to 50 ms with a target value of 5E5 ions. HCD fragmentation was triggered at top speed [3 s] for MS2 analysis. MS2 injection time was set to 175 ms with a target of 2E5 ions and resolution of 15 000. Ions that have already been selected for MS2 were dynamically excluded for 30 s.
Data were processed following recommendations from Zecha et al. (46). MS raw data were processed using MaxQuant software (47) version 1.6.14.0 and searched with the Andromeda search engine (48) against the UniProt Homo sapiens database (2018, 95 146 entries). Data were searched with multiplicity set to MS2 level TMT16plex. First and main searches were done with a precursor mass tolerance of 20 ppm for the first search and 4.5 ppm for the main search. MS/MS mass tolerance was set to 20 ppm. Minimum peptide length was set to seven amino acids and trypsin cleavage was selected allowing up to two missed cleavage sites. Methionine oxidation, N-terminal acetylation, SILAC Arg10 and SILAC Lys8 were selected as variable modifications and carbamidomethylation as a fixed modification. False discovery rate was set to 1%.
MaxQuant output was processed using Perseus software (49) version 1.6.15.0. The MaxQuant Evidence.txt file was used to create a new protein groups file. In short, data were culled of contaminant, reverse and unique proteins only peptide identifications before identifying the TMT reporter ion intensities that contain the variable SILAC Arg10 and Lys8 modifications. Identical peptide sequences were combined by median. The data were then exported to R and a script was run to combine the 'TMT reporter intensity corrected' peptide sequences that belong to the same protein into a 'protein group' TMT reporter intensity value (R script available upon request). Protein level data were normalised by LIMMA to account for batch effect differences.
Further data processing and analyses followed recommendations from Zecha et al. (46). To normalise and focus on newly synthesised proteins, which contain the heavy-label (Arg10, Lys8) TMT intensity ($I_{Lys8,Arg10}$), custom R-scripts were used to convert the per-gene intensity data into fraction 'heavy' ($F_H$) by dividing TMT-intensities from 'heavy-labelled' by the sum of light- and heavy-labelled TMT intensities ($F_H = I_{Lys8,Arg10} / (I_{Lys8,Arg10} + I_{Lys0,Arg0})$). Considering the limited time points that were collected and to calculate the apparent translation rate of newly synthesised protein, a linearised first order equation for labelling kinetics (single exponential growth) was fitted to the logarithmic data, which yields the apparent translation rate k as the slope of the fit ($\ln F_H = k \cdot t + \mathrm{offset}$). To remove poor quality fits, data for the rate k for both control and hippuristanol conditions were filtered using a P-value cut-off of 0.1 (F-test, null-hypothesis of k = 0; input: 1337 proteins, P < 0.1: 1270, P > 0.1: 67). Next, difference and log2-fold change (hippuristanol/DMSO) and associated false discovery rates (FDR) were calculated using standard procedures. Proteins were grouped as eIF4A1-dependent or -independent if the FDR of the difference between the apparent translational rate under hippuristanol and DMSO control was smaller than 0.1 or larger than 0.7, respectively. This resulted in 255 eIF4A1-dependent and 244 eIF4A1-independent genes. Motif identification for AG5- and GC5-motifs was done as described under the RNA structure-seq2 section. For Figure 2I, the same DMS reactivity data as in Figure 2H were used, and windows were categorised as decrease or increase in RNA structure if their change in DMS reactivity was below or above 0. In all figures, where applicable, data are filtered for the most abundant transcripts using RNAseq data from our previous study Waldron et al.
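The conversion to fraction heavy and the linearised fit described above can be illustrated with a minimal Python sketch. This is not the authors' R script (which is available from them on request); the function names and the example intensities are hypothetical, and an ordinary least-squares fit is assumed for the line ln F_H = k * t + offset.

```python
import math

def fraction_heavy(i_heavy, i_light):
    """Fraction of heavy-labelled (newly synthesised) protein, F_H."""
    return i_heavy / (i_heavy + i_light)

def fit_apparent_rate(times, f_heavy):
    """Ordinary least-squares fit of ln(F_H) = k * t + offset.

    Returns (k, offset), where k is the apparent translation rate.
    """
    y = [math.log(f) for f in f_heavy]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(y) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in zip(times, y))
    k = sxy / sxx
    return k, y_mean - k * t_mean

# Hypothetical time course (hours) and TMT intensities for one protein.
times = [1.0, 2.0, 4.0, 8.0]
heavy = [5.0, 10.0, 19.0, 36.0]
light = [95.0, 90.0, 81.0, 64.0]

f_h = [fraction_heavy(h, l) for h, l in zip(heavy, light)]
k, offset = fit_apparent_rate(times, f_h)
print(f"apparent rate k = {k:.3f} per hour, offset = {offset:.3f}")
```

Fitting the same protein under DMSO and hippuristanol would give the two rates whose difference and log2-fold change are compared in the text.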

Statistical analysis
If not stated otherwise, n is the number of independent biological replicates of the described experiment and is given in the figure legends. Quantitative experiments including unwinding and ATPase assays and FLIM-FRET were performed in technical duplicates per biological replicate, the average of which was used for downstream analysis. Except for statistical tests based on sequencing data, significance was determined using a two-tailed and unpaired t test.
Where applicable P-values were corrected for multiple testing by calculating FDRs. Group sizes are summarised in Supplementary Table S1. Statistical significances are given as the absolute, adjusted p-values in the figures or figure legends.
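The text does not state which FDR procedure was used. As an illustration only, the following Python sketch assumes the widely used Benjamini-Hochberg step-up procedure; the function name and the example P-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical P-values from four comparisons.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```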

RESULTS
Single-stranded RNA sequences modulate the level of eIF4A1-unwinding and eIF4A1-dependence of mRNA translation in vitro
eIF4A1 is considered to bind RNA sequences non-specifically (50,51). However, a more recent study by Iwasaki et al., in which short RNAs from a library bound by recombinant eIF4A1 were sequenced after immunoprecipitation (bind-n-seq) (30), suggested sequence-preferential binding of eIF4A1 (Supplementary Figure S1A), but the effect of the individual nucleotide sequences within the single-stranded RNAs on the catalytic capacities of eIF4A1 was not investigated. To examine this, we first measured the RNA unwinding and ATPase activity of recombinant eIF4A1 (Supplementary Figure S1B) in vitro using RNA substrates containing an identical 24 bp duplex with 20 nt 5' overhang sequences that were expected to span a range of RNA binding affinities based on the bind-n-seq experiment by Iwasaki et al. (Figure 1A) (30). This not only validated RNA sequence-specific binding of eIF4A1, but also showed that eIF4A1 unwinding activity was modulated by the nucleotide sequence of the single-stranded RNA. In particular, unwinding was most stimulated by AG-repeats and least stimulated by UC-repeats (Figure 1A). RNA sequence-specific unwinding was also observed in the presence of cofactors eIF4H and eIF4G (Figure 1B), while differential RNA sequence-specific affinities of eIF4A1 were almost abolished. In comparison, the RNA sequence of the single-stranded overhang had nearly no effect on eIF4A1's ATPase activity (Supplementary Figure S1C). Hence, single-stranded RNA sequences mainly influenced unwinding by eIF4A1, which could not entirely be explained by differential RNA binding affinities nor ATPase activities alone (Figure 1B). Further, this also suggested that the nucleotide sequence of single-stranded RNA modulates eIF4A1 unwinding through a mechanism distinct from the effect of its length as described previously for yeast eIF4A (27).
We next asked if this differential, sequence-dependent unwinding has a functional impact on translation. For this, we employed luciferase-reporter translation assays in vitro to specifically examine the effect of the tested sequences with the largest differential in unwinding, i.e. AG- and UC-repeats. Assays were performed in nuclease-untreated rabbit reticulocyte lysate (RRL) with capped mRNA constructs that contained a linear 5'UTR of CAA-repeats ± a double stem-loop (SL, 2 × 11 bp and 4 nt loop) ± a 20 nt AG- or UC-repeat positioned upstream of the SL (Figure 1C). Position and design of similar SLs have previously been shown to inhibit scanning rather than recruitment of reporter mRNAs (12,52-54). While the SL effectively repressed translation, the presence of an AG-box upstream of the SL, strikingly, led to de-repression (Figure 1D). In contrast, no such effect was observed on the translation of the linear reporters (Figure 1D), when an AG-repeat RNA was added in trans to the SL-reporter (Supplementary Figure S1D), or with the UC sequence instead (Figure 1D and Supplementary Figure S1E). Moreover, de-repression was sensitive to the location of the AG-motif relative to the SL, with a strong 5'-3' directional bias (Figure 1E-F and Supplementary Figure S1F-G). Altogether this strongly suggested that de-repression of the structured reporter by the AG-repeat is specific (Figure 1D). This was supported by using RNA substrates corresponding to the reporter 5'UTRs, showing that both eIF4A1 binding affinity and unwinding were stronger if the 5'UTR contained the AG-repeat (Supplementary Figure S1H-J) and confirming preferential unwinding in 5'-3' direction (Supplementary Figure S1K-L).
To investigate if AG-dependent de-repression is cap-dependent, we added recombinant 4E-BP1, an eIF4E-cap binding inhibitor, to the reactions, or used mRNA constructs with a non-functional ApppG-cap (A-cap), both of which showed that de-repression by site-specific unwinding requires cap-dependent translation initiation (Figure 1G). To next examine the specific role of eIF4A1 for AG-dependent de-repression, we added recombinant eIF4A1 wild-type (eIF4A1 wt) or eIF4A1 E183Q, which is catalytically inactive (hereafter named eIF4A1 DQAD, Supplementary Figure S1M) (55), to the reactions. While addition of eIF4A1 wt stimulated translation particularly of the SL reporters, eIF4A1 DQAD strongly inhibited translation of all reporters, demonstrating the strict dependency of reporter translation on eIF4A1 (Figure 1H).
In summary, eIF4A1 interacts with single-stranded RNA in a sequence-specific manner, which results in RNA-specific activation of RNA unwinding that favours translation of structurally repressed reporter mRNAs in cap-dependent translation in vitro, rendering mRNA translation more eIF4A1-dependent (Figure 1I).

Purine-rich 5'UTR sequences modulate the level of eIF4A1-dependence of mRNA translation in cells by activating localised eIF4A1-unwinding
To examine the global connection between primary RNA sequence and eIF4A1-dependency of translation of specific mRNAs in cells, we applied metabolic pulse-labelling together with quantitative TMT labelling (TMT-pSILAC) in MCF7 cells over a time course immediately following inhibition of eIF4A1 with hippuristanol, which prevents eIF4A1 RNA-binding and unwinding (14,15) (Figure 2A). To measure the associated change in translation per protein in response to eIF4A1-inhibition, we calculated the apparent translation rate of newly synthesised proteins ($k_{hippuristanol}$, $k_{DMSO}$, Figure 2B). The experiment was performed in quadruplicate, which allowed us to confidently measure direct changes in protein synthesis rates following eIF4A1-inhibition (Supplementary Figure S2A-B). In agreement with hippuristanol being a translational inhibitor (12,14,15), translation rates were nearly exclusively downregulated in response to the treatment (Supplementary Figure S2C, D). The analysis revealed 255 hippuristanol-sensitive/eIF4A1-dependent mRNAs (254 repressed, 1 upregulated) and 244 hippuristanol-resistant/eIF4A1-independent mRNAs (Figure 2C, D and Supplementary Tables S1 and S2).
To validate these mRNA families and examine whether eIF4A1-dependency was mediated by the 5'UTRs of these mRNAs, we randomly selected ten eIF4A1-dependent and five eIF4A1-independent mRNAs and cloned their 5'UTR sequences into firefly luciferase reporters (scheme Figure 2E and Supplementary Table S3). To recapitulate the setup of the SILAC experiment, translation rates of the reporter mRNAs were derived from normalised luciferase activity over time (Supplementary Figure S2E), following a 5 h period of hippuristanol treatment (scheme Figure 2E). Although the relative level of translational repression by hippuristanol varied between the different eIF4A1-dependent reporters (Figure 2E), each of them showed a stronger response than any of the reporters with eIF4A1-independent 5'UTRs (Figure 2E). Altogether, reporters with eIF4A1-dependent 5'UTRs were significantly more sensitive to hippuristanol-treatment than reporters with eIF4A1-independent 5'UTRs (Figure 2F). This demonstrated that the 5'UTR sequences of the selected mRNAs are sufficient to establish differential eIF4A1-dependency.
Previous data sets that investigated the change in translational efficiency following eIF4A1-inactivation highlighted global 5'UTR features including length, stability and GC-content as markers rendering mRNA translation eIF4A1-dependent (5,12,13,23,30,56). Interestingly, these features were not different between eIF4A1-dependent and -independent mRNAs (Supplementary Figure S2F). Additionally, examining the AG-content within the 5'UTRs of these mRNAs also showed no global distinction between the two groups of mRNAs (Supplementary Figure S2F). Taken together, this suggested that other, less global mRNA features are responsible for eIF4A1-dependence.
To investigate the role of RNA sequence motifs for eIF4A1-dependent translation, we asked specifically if the presence of AG-sequence motifs within the transcript is associated with differential translation rates, mirroring the in vitro experiments (Figure 1D). For this, we grouped mRNAs if their 5'UTRs contained non-overlapping 10 nt AG motifs (see methods) or, as a reference, GC-repeats (GC5), a previously highlighted marker for eIF4A1-dependence of translation (5,11-13). This showed that translation rates of eIF4A1-dependent mRNAs with AG5-motifs in their 5'UTRs were significantly more strongly repressed after eIF4A1-inhibition (Figure 2G), while, in contrast, the reference mRNA group with GC5-motifs within their 5'UTR was not associated with a change in translation rate upon eIF4A1-inhibition. This suggested that the presence of AG5-motifs in the 5'UTR of eIF4A1-dependent mRNAs increases their requirement of eIF4A1 activity for translation.
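Grouping 5'UTRs by the presence of non-overlapping 10 nt motifs can be sketched in Python as follows. The exact motif definition is given in the methods and is not reproduced here; purely for illustration, this sketch assumes that a motif is any 10 nt window consisting only of A and G, and the function name and example sequences are hypothetical.

```python
import re

# Assumed motif definition (illustration only): 10 consecutive purines (A or G).
AG_WINDOW = re.compile(r"[AG]{10}")

def find_ag_motifs(utr):
    """Return 0-based start positions of non-overlapping 10-nt A/G-only windows.

    re.finditer scans left to right and never returns overlapping matches.
    """
    return [m.start() for m in AG_WINDOW.finditer(utr.upper())]

# Hypothetical 5'UTR fragments.
print(find_ag_motifs("CCAGAGAGAGAGCC"))
print(find_ag_motifs("CAACAACAACAACAA"))
```

A 5'UTR would then fall into the motif-containing group if the returned list is non-empty.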
We then asked if the stronger translational repression of AG5-motif containing eIF4A1-dependent mRNAs is related to structural rearrangements induced by inhibition of eIF4A1 activity with hippuristanol treatment. For this, we first wanted to understand if the eIF4A1-dependent changes in translation rates are generally associated with changes in RNA structure. To do so, we took advantage of our previous Structure-seq2 data (12) that have also been obtained in MCF7 cells following specific inhibition of eIF4A1 with hippuristanol (Supplementary Figure S2G) (14,15). To evaluate the change in RNA structure, we compared the ΔDMS-reactivity (hippuristanol - control), i.e. the change in single-strandedness, of eIF4A1-dependent and -independent mRNAs. This revealed that, following eIF4A1-inhibition, changes in global RNA structure between the two groups of mRNAs were similar in each RNA region (5'UTR, 3'UTR and CDS) (Supplementary Figure S2G). This agrees with our previous findings that eIF4A1-inhibition does not affect mRNA structure globally (12). Further, previous studies, including ours examining specifically eIF4A1 (12), have shown that DEAD-box RNA helicases rearrange localised RNA structures (57,58). To specifically test whether AG5-motifs guide local unwinding of RNA structure in an eIF4A1-dependent manner, we compared the change in RNA structure (ΔDMS-reactivity) in 20 nt sliding RNA regions up- and downstream of AG5 motifs (Supplementary Figure S2H). The analysis revealed that in the 5'UTR the content of RNA structure in RNA regions downstream of AG5 motifs increases significantly upon eIF4A1-inhibition (Figure 2H and Supplementary Figure S2H), while this is not observed for RNA regions upstream (Figure 2H and Supplementary Figure S2H) or around randomly selected non-AG5 motifs within the same 5'UTRs (Figure 2H). Neither were site-specific changes in RNA structure observed for RNA regions around AG5-motifs in the CDS or 3'UTR of the same transcripts (Figure 2H).
Interestingly, the location of the AG5 motifs in the 5'UTR was unbiased (Supplementary Figure S2I) and the stabilities of RNA structures folded from the DMS-reactivities of the RNA regions downstream of these 5'UTR-AG5 motifs were not different from the stabilities calculated from random locations within the same 5'UTR (same AG5 and random regions as in Figure 2H, Supplementary Figure S2J). Altogether, this strongly suggested site-specific eIF4A1-dependent unwinding downstream of the AG5 motifs in the 5'UTR of eIF4A1-dependent mRNAs (scheme Figure 2H). Finally, we asked if changes in RNA structure in these eIF4A1-unwinding dependent RNA regions downstream of the AG5 motifs in the 5'UTR affected translation of the mRNA (see also Figure 2G). For this we paired the change in RNA structure of these RNA regions (Figure 2H) with the change in translation rate (Figure 2D) of the respective mRNA following eIF4A1-inhibition. This revealed that gain of RNA structure in RNA regions downstream of the 5'UTR AG5 motifs was associated with translational repression following eIF4A1-inhibition (Figure 2I, same AG5 motifs and regions as in Figure 2H). In contrast, this was not the case for random locations within the 5'UTR of the same transcript (Figure 2I, same random motifs and regions as in Figure 2H) nor for RNA structure upstream of these motifs (Supplementary Figure S2K). Thus, this strongly suggested that AG5-motifs stimulate eIF4A1-dependent unwinding of downstream RNA structure to facilitate mRNA translation (scheme Figure 2I). Gene set enrichment analysis suggests that mRNAs containing AG-motifs that use this mechanism to activate eIF4A1 play a critical role in the translation of known mRNAs with proliferative signature, including components of mTORC-signalling and cell cycle progression as well as myc targets (Figure 2J).
Taken together, AG-rich RNA sequences in the 5'UTR site-specifically regulate eIF4A1 helicase activity to facilitate translation of eIF4A1-dependent mRNAs with local repressive RNA structure, including mRNAs critical for cell cycle progression.
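The comparison of DMS reactivity changes in regions flanking a motif, as used in this section, can be illustrated with a small Python sketch. It is a simplification under stated assumptions: a single 20 nt window directly upstream and directly downstream of a 10 nt motif is evaluated rather than a series of sliding windows, reactivities are given as per-nucleotide lists, and all names and values are hypothetical.

```python
def mean_delta_reactivity(control, treated, start, end):
    """Mean (treated - control) DMS reactivity over positions start..end-1."""
    deltas = [t - c for c, t in zip(control[start:end], treated[start:end])]
    return sum(deltas) / len(deltas)

def flanking_deltas(control, treated, motif_start, motif_len=10, window=20):
    """Mean reactivity change upstream and downstream of a motif.

    Returns (upstream, downstream); an entry is None if the window
    does not fit inside the sequence.
    """
    up_start = motif_start - window
    down_start = motif_start + motif_len
    down_end = down_start + window
    upstream = (mean_delta_reactivity(control, treated, up_start, motif_start)
                if up_start >= 0 else None)
    downstream = (mean_delta_reactivity(control, treated, down_start, down_end)
                  if down_end <= len(control) else None)
    return upstream, downstream

# Hypothetical 60-nt 5'UTR: reactivity drops downstream of a motif at position 20.
control = [0.5] * 60
hippuristanol = [0.5] * 30 + [0.3] * 30
print(flanking_deltas(control, hippuristanol, motif_start=20))
```

Because DMS reactivity reports on single-strandedness, the sign of each value indicates whether the window became more or less accessible after eIF4A1 inhibition.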

RNA sequence-dependent unwinding by eIF4A1 is stimulated by eIF4A1-multimerisation
To assess how RNA sequences, in particular the AG-repeat sequences, specifically activate and stimulate eIF4A1 unwinding mechanistically, we next examined eIF4A1's catalytic capacities in more detail in vitro. For this we aimed to characterise the differential unwinding of substrates with AG- and CAA-overhang, to which eIF4A1 displayed comparable affinities (Figure 1A). Titrations confirmed similar functional binding affinity ($K_{1/2}$ ∼ 2 μM, Figure 3A) and that unwinding activity on the CAA-overhang substrate was weaker compared to the AG-overhang (Figure 3A). The functional binding isotherms of the curves for both overhang-sequences were sigmoidal and revealed a Hill-coefficient (h) > 1 (Figure 3A). This was also observed at different substrate concentrations and duplex lengths (Supplementary Figure S3A, B), indicating cooperation of multiple eIF4A1 copies in the unwinding reaction, while, in contrast, the ATPase activity did not appear to require cooperation (Supplementary Figure S3C, h ≥ 0.5).
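The meaning of a Hill coefficient above 1 for a sigmoidal titration curve can be made concrete with a short Python sketch. It uses the standard Hill equation and estimates h as the slope of a Hill plot; the fitting procedure actually used for Figure 3A is not described in this section, so the approach, the function names and the example values are assumptions.

```python
import math

def hill(conc, k_half, h, top=1.0):
    """Hill equation: activity at a given enzyme concentration."""
    return top * conc ** h / (k_half ** h + conc ** h)

def hill_coefficient(concs, fractions):
    """Estimate h as the slope of ln(theta / (1 - theta)) versus ln(conc).

    fractions must lie strictly between 0 and 1 (activity / maximal activity).
    """
    x = [math.log(c) for c in concs]
    y = [math.log(f / (1.0 - f)) for f in fractions]
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxy = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    sxx = sum((a - x_mean) ** 2 for a in x)
    return sxy / sxx

# Hypothetical titration (concentrations in μM) simulated with K_1/2 = 2 μM and h = 2.
concs = [0.5, 1.0, 2.0, 4.0, 8.0]
fractions = [hill(c, k_half=2.0, h=2.0) for c in concs]
print(hill_coefficient(concs, fractions))
```

A slope close to 1 would indicate independent sites, whereas a slope above 1 indicates that more than one enzyme molecule contributes to the measured activity.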
Since eIF4A1 only contains one active site for unwinding, enzymatic cooperativity would mean involvement of multiple eIF4A1 molecules in the reaction. We therefore sought to resolve putative multimeric eIF4A1-RNA complexes by native electrophoretic mobility shift assays. Multimeric eIF4A1 complexes were clearly detectable with the unwinding-activating AG-RNA but not detectable with the less activating CAA-RNA (Figure 3B, Supplementary Figure S3D). The eIF4A-inhibitor silvestrol promoted eIF4A1 multimerisation specifically on the AG-RNA but not CAA-RNA (Figure 3B, Supplementary Figure S3E), while hippuristanol reduced RNA-binding of eIF4A1 to both RNAs and abrogated multimeric complex formation (Figure 3B). Analytical gel filtration and ultracentrifugation revealed that eIF4A1 multimerisation is (i) only induced upon RNA-binding (Supplementary Figure S3F, G), (ii) most pronounced at excess eIF4A1 concentrations (Supplementary Figure S3F, G) and (iii) RNA sequence-specific (Figure 3B-C). The largest state of multimeric eIF4A1 complexes showed a stoichiometry of eIF4A1:AG-RNA of 3:1 as determined from analytical gel filtration and ultracentrifugation (Supplementary Figure S3F, G and Supplementary Table S4). Intracellular eIF4A1 is in large excess over typical total mRNA concentrations (eIF4A1: 10-20 μM (59), total mRNA < 1 μM, see Materials and Methods) at an estimated eIF4A1:mRNA ratio of 6-50:1 (60), conditions at which we observe multimerisation in vitro. We then asked if eIF4A1 multimerises in cells, for which we employed fluorescence lifetime imaging-fluorescence resonance energy transfer (FLIM-FRET) (61).
Having detected the ability of eIF4A1 to multimerise on AG-RNA, we next performed a direct comparison of the unwinding activity of substrate-bound eIF4A1 when levels of multimerisation were either high or low. For this, conditions were required to allow binding of eIF4A1 to the substrate prior to the unwinding reaction.
To avoid ATP, which is required for eIF4A1 to bind RNA and initiates unwinding, we used silvestrol because it clamps eIF4A1 onto RNA in an ATP-independent manner (Figure 3F and Supplementary Figure S3L). This also allowed us to set up the reaction so that the total protein as well as the substrate-bound concentrations matched between the conditions. Since protein excess is required for eIF4A1-multimerisation (Supplementary Figure S3F), we first clamped eIF4A1 to the AG-substrate under conditions that allow multimer formation and then, to reduce the degree of multimerisation, subsequently added unlabelled AG-RNA to scavenge excess eIF4A1 from the solution and the multimers (Figure 3F, G). This revealed that highly multimerised eIF4A1 was fully active while lowly multimerised eIF4A1 displayed only residual unwinding activity even though the AG-RNA substrate was fully bound to eIF4A1 (Figure 3H). In contrast, the increase in ATPase activity reflected binding of eIF4A1 to the added scavenger RNA, validating the active, functional state of the protein. Interestingly, increasing amounts of scavenger RNA reduced eIF4A1's unwinding activity already by over 70% before multimerisation of substrate-bound eIF4A1 was reduced, indicating that not only substrate-bound but also unbound, free eIF4A1 molecules participate in the unwinding reaction (Figure 3G, H). To visualise substrate binding and unwinding simultaneously, we performed a dual-colour gel shift unwinding assay under similar conditions. This confirmed that, under clamping conditions (silvestrol) and in the absence of ATP, eIF4A1 was fully bound to the AG-substrate without unwinding it under both high and low multimerisation conditions (Figure 3I, lanes 4 and 8).
Yet, only under high multimerisation conditions did addition of ATP induce strand separation ( Figure 3I, lanes 5 + 6 versus 9 + 10) and, in addition, overhang-clamped eIF4A1 strongly stimulated unwinding and remained bound to the overhang strand after the reaction ( Figure 3I, lane 6 and Supplementary Figure S3M).
In conclusion, these data suggest that RNA sequence-specific unwinding by eIF4A1, particularly on AG-RNA, is mediated by the overhang sequence of the substrate, allowing eIF4A1-multimerisation that enables cooperation between overhang-bound and -unbound eIF4A1 molecules. These effects are enhanced by AG-repeat sequences.

Different subunits within the eIF4A1-multimer operate distinctly to enable RNA sequence-specific unwinding
To better understand the functional connection between overhang-bound and unwinding-performing subunits within the multimeric eIF4A1 complex, we aimed to probe for functional cooperativity within the eIF4A1 multimer directly. For this, we followed an approach that has been used for multimeric ATPases and helicases previously (62-64), in which, briefly, the catalytic activity of the multimeric enzyme is monitored when wildtype and an inactive variant are mixed at different fractions but at the same total protein concentration. Absence of functional cooperativity would result in a linear trend, with x + y = 1 (62), when plotting activity versus fraction of the inactive variant. Mixing eIF4A1 wt with catalytically inactive eIF4A1 DQAD (Supplementary Figures S1M and S3K) (55), our results show a differential level of functional cooperativity for eIF4A1 unwinding between the AG- and CAA-overhang (Figure 4A), which correlates with differential unwinding activity on these substrates. This suggested that specific activation of eIF4A1 unwinding underlies enhanced functional cooperativity between eIF4A1-subunits within the multimeric eIF4A1 complex in an overhang sequence-dependent manner.
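The linear reference line of the mixing experiment can be written down explicitly, together with one possible cooperative alternative for comparison. In this Python sketch, x is the fraction of inactive variant and y the relative activity; with one subunit it reproduces the non-cooperative expectation x + y = 1. The all-or-none model with several subunits is a hypothetical textbook-style illustration of how cooperativity bends the curve and is not the model proposed in this study.

```python
def expected_activity(fraction_inactive, subunits=1):
    """Relative activity y for a given fraction x of inactive variant.

    subunits=1: independent subunits, y = 1 - x (no cooperativity).
    subunits=n: illustrative all-or-none model in which a complex of n
    randomly drawn subunits is active only if all n are wild-type.
    """
    return (1.0 - fraction_inactive) ** subunits

# Hypothetical mixing series at constant total protein concentration.
for x in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(x, expected_activity(x), expected_activity(x, subunits=3))
```

Measured activities that fall away from the straight line indicate that subunits do not act independently.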
We next asked if functional cooperativity between eIF4A1 subunits stems from participation of the overhang-bound eIF4A1 directly in the strand separation reaction. For this we asked if binding of catalytically inactive eIF4A1 DQAD to the overhang of the substrate before addition of eIF4A1 wt inhibits or activates the helicase activity of eIF4A1 wt (Figure 4B). To do this, we first clamped eIF4A1 DQAD to the overhang of the substrate using silvestrol, which recovered wildtype-like RNA-binding affinity as well as kinetic stability without rescuing its unwinding activity (Supplementary Figures S4A-C). Thus, this excluded the possibility that additional eIF4A1 wt could replace overhang-clamped eIF4A1 DQAD during the experiment. Strikingly, clamping inactive eIF4A1 DQAD to the overhang supported unwinding by eIF4A1 wt at a rate similar to the eIF4A1 wt-only reaction (Figure 4B and Supplementary Figure S4D), indicating beneficial cooperation between the overhang-bound, catalytically inactive eIF4A1 DQAD and unwinding-active eIF4A1 wt. This strongly suggested that the different eIF4A1-copies within multimeric eIF4A1 have different functions. Supporting this, the overall ATPase activity was reduced when eIF4A1 DQAD was clamped to the overhang (Supplementary Figure S4E), while unwinding was unaffected (Figure 4B). In this setup, the observed ATPase activity is exclusively performed by eIF4A1 wt that performs the actual strand separation. Thus, in the wt-only multimeric eIF4A1 complex, the subunits bound to the overhang and the subunits performing the strand separation have different ATPase activities.
In summary, overhang-bound eIF4A1 is not directly involved in unwinding but critical for loading and activating proximal strand separation by distinct eIF4A1 molecules. We thus refer to roles of these different subunits within the multimeric eIF4A1 complex as loading, i.e. bound to the single-stranded RNA overhang, and unwinding, i.e. performing the strand separation of the double-stranded RNA region. As Hill-coefficients under clamping conditions indicate activity of two unwinding subunits on short and long duplexes (Figure 4C), we suggest a model in which the eIF4A1-RNA-loading complex activates at least two unwinding subunits (Figure 4D). The catalytic capacity of the loading-complex appears dispensable, suggesting a binding-induced mechanism of activation.

eIF4A1 cofactors operate distinctly upon multimeric eIF4A1
Cellular eIF4A1 function is believed to be tightly regulated through interactions with its cofactors eIF4H, eIF4B and eIF4G (18-21). As our initial results showed that the pattern of RNA sequence-specific unwinding activity of eIF4A1 is differently affected by different eIF4A1 cofactors, we investigated if the cofactors operate upon multimeric eIF4A1. Our Supplementary Results demonstrate (i) that stimulation of RNA sequence-specific unwinding of eIF4A1 by eIF4G or eIF4H is optimal under conditions that allow multimeric eIF4A1 complex formation, (ii) that stimulation by eIF4G or eIF4H occurs in an RNA sequence-specific manner, and (iii) that eIF4G and eIF4H operate differently on multimeric eIF4A1, with eIF4G functioning upon or replacing the loading subunit while eIF4H improves activity of the unwinding subunits (for detailed presentation, see Supplementary Results and Supplementary Figure S5). Altogether, these results demonstrate that activity of eIF4A1 cofactors differentially stimulates multimeric eIF4A1 complexes to facilitate distinct RNA sequence-specific unwinding activities.

RNA sequence-specific eIF4A1 complexes
To investigate how the RNA sequence facilitates activation of eIF4A1 unwinding at a structural level, we next examined the shape of complexes formed between eIF4A1 bound to different single-stranded RNAs using small-angle X-ray scattering (SAXS). Envelope models revealed that apo-eIF4A1 fitted better to eIF4A in an open than in a closed conformation, which was in agreement with an extended conformation of the apo-protein (Supplementary Figure S6A). Moreover, the shape of the eIF4A1-CAA-RNA complex (eIF4A1 bound to CAA-RNA) suggested a similarly extended conformation as observed with apo-eIF4A1, while the eIF4A1-AG-RNA complex (eIF4A1 bound to AG-RNA) was in a different, more compact conformation as compared to eIF4A1-CAA-RNA and apo-eIF4A1 (Figure 5A and Supplementary Figure S6B, C and Supplementary Table S5). In support, linear free energy relationship measurements (65) demonstrated a higher proportion of both ionic and non-ionic interactions in the eIF4A1-AG-RNA complex than in the eIF4A1-CAA-RNA complex, suggesting distinct and RNA-specific eIF4A1-AG- and eIF4A1-CAA-RNA binding interfaces (Supplementary Figure S6D). Since it has been shown recently that RNA length modulates the conformation of yeast eIF4A (27), this altogether strongly suggested that both the length and the sequence of the RNA bound by human eIF4A1 guide specific conformational transitions in the protein.
SAXS of multimeric eIF4A1-AG-RNA complexes revealed a non-linear shape of the complex with a larger radius of gyration ($R_g$) and volume of correlation ($V_C$) than monomers ( Figure Table S5), more than one eIF4A1 subunit within multimeric eIF4A1 could be RNA associated. It is unlikely that more than one subunit binds tightly to the ssRNA because a 10 nt AG-RNA provides only one direct eIF4A1 binding site, as shown by a recent crystal structure of eIF4A1 in complex with the 10 nt AG-RNA and a silvestrol derivative (66), but eIF4A1 multimerisation is still observed on a 10 nt AG-RNA in the presence of silvestrol (Figure 5C and Supplementary Figure S3G). In conclusion, RNA sequence is critical to establish specific binding interfaces and thus conformational states of eIF4A1 that allow formation of multimeric complexes in which only one eIF4A1 protein is in tight contact with the single-stranded RNA region.
To investigate how eIF4A1-loading complexes activate unwinding, we next performed SAXS on multimeric eIF4A1 complexes bound to the AG-overhang substrate in the presence of AMP-PNP, which reflects the loaded, pre-unwinding state (Figure 3I, lane 4). Superposition with the multimeric eIF4A1-AG-RNA complex enabled identification of the overhang (eIF4A1 covered) and the duplex region of the substrate (Figure 5D and Supplementary Figure S6H-I). The measured duplex diameter was slightly larger (24 Å versus 32 Å, ∼33%) than expected, indicating an underestimation of dimensions in the envelopes. Surprisingly, the length of the detected duplex region was shorter than the expected length (59 Å/21 bp versus 67 Å/24 bp), suggesting that the eIF4A1-loading complex is located precisely at the overhang-duplex fork and may be covering parts of the duplex region. Fitting a 24 bp dsRNA into the envelope suggests ∼5 bp might be buried inside the multimeric eIF4A1-loading complex (Supplementary Figure S6J). Moreover, FRET experiments focusing on the overhang-fork region of the substrate demonstrated a conformational change in the RNA upon eIF4A1 loading complex formation specific to the multimeric state (Figure 5E and Supplementary Figure S6K, L).
Taken together our data support a mechanism in which eIF4A1 undergoes RNA sequence-specific conformational changes that trigger assembly of multimeric eIF4A1-RNA complexes. Within the multimers, the RNA overhang region adopts a conformation that places eIF4A1 subunits directly at the overhang-fork region and partially onto the duplex region. We hypothesise that this is critical for activation of sequence-specific unwinding.

DISCUSSION
The DEAD-box RNA helicase eIF4A1 catalyses at least two major reactions in translation initiation. First, eIF4A1 activity is essential to load mRNAs onto the 43S preinitiation complex (PIC) and, second, eIF4A1-dependent unwinding of RNA secondary structure facilitates translocation of the PIC along the mRNAs' 5'UTR with high structural content (6-10). In this study we uncover a mechanism for how such eIF4A1-dependent mRNAs specifically recruit and activate eIF4A1 unwinding activity. Our data reveal that, in vitro and in cells, (i) eIF4A1 helicase activity is induced in an RNA sequence-specific manner through eIF4A1-multimerisation and (ii) that this mechanism of eIF4A1 regulation is used by eIF4A1-dependent mRNAs to overcome translational repression due to localised RNA structure (Figure 5F). Within the 5'UTR of eIF4A1-dependent mRNAs, we identify specific RNA sequence motifs, particularly enriched for polypurines, which function to specifically recruit and trigger eIF4A1-multimerisation to activate eIF4A1-dependent strand separation of local repressive RNA structure to facilitate mRNA translation.
In order to examine the relationship between eIF4A1-dependent unwinding and translation in cells, we combined RNA-structure-seq2 with TMT-pulsed SILAC following eIF4A1 inhibition. In contrast to previous studies, which typically consider single time points representing pre- or steady states and thus capture limited dynamics (5,12,13), we performed a time course experiment to directly quantify translation rates of newly synthesised proteins immediately after eIF4A1-inhibition (Figure 2A-B). This identified hippuristanol-sensitive/eIF4A1-dependent and hippuristanol-resistant/eIF4A1-independent mRNAs (Figure 2C), which we validated in reporter assays (Figure 2E, F). Our study finds that eIF4A1-dependent mRNAs have neither longer-than-average 5'UTRs nor increased GC content, in contrast to previous reports (5,12,13). Proteins identified in our study are detected by a threshold minimum rate of incorporation of the metabolic labelling agent, hence mRNA groups identified through the analysis naturally exhibit fast translation rates. This allowed us to identify features of eIF4A1-dependent mRNAs that increase their sensitivity to eIF4A1-activity. This was achieved through analyses of two independent approaches (RNA structure-seq2 and TMT pulsed SILAC) which revealed RNA sequence-dependent activities of eIF4A1 in cells. This allowed us to define eIF4A1-dependent mRNAs that contain AG-rich motifs in their 5'UTR that essentially facilitate eIF4A1-dependent translation. These findings agree with our in vitro data showing that eIF4A1-unwinding activity is stimulated in an RNA sequence-dependent manner, with polypurine-rich sequences enhancing eIF4A1 unwinding the most.
In support of previous reports (22), this activity of eIF4A1 displays a strong 5'-3' directional and positional bias in cells and in vitro, demonstrating specificity and an activity-guiding role of the AG-sequences. mRNAs that contain such AG5-motifs to regulate their translation include well-described eIF4A1-dependent mRNAs such as myc targets, and mRNAs encoding components of cell cycle regulation and mTORC-signalling (Figure 2J).

Figure 5. (A, B, D) Small-angle X-ray scattering (SAXS)-based envelope models of (A) apo-eIF4A1 (ligand-free, black) and monomeric eIF4A1 bound to AG-RNA (blue) or CAA-RNA (red), respectively, (B) multimeric eIF4A1 bound to AG-RNA (purple) or (D) AG-overhang substrate (green), respectively. Monomers and multimers were generated according to results shown in Figure 3. V_c and R_g: envelope volume of correlation and radius of gyration derived from experimental data; D_max: maximum distance, given as mean ± SD from four measurements based on SAXS models using PyMOL2. P(r)-derived D_max is summarised in Supplementary Table S5. (C) Representative EMSA of 3 μM eIF4A1 and 10 nt labelled AG-RNA in the presence of eIF4A1-inhibitors, n = 3 repeat experiments. (E) Changes in FRET efficiency of labelled AG-overhang substrate upon multimeric and monomeric eIF4A1-binding using 0 and 6 μM competitor RNA, respectively (see Figure 3G); data are mean ± SD from four repeat experiments, n = 4. P-values were calculated by one-way ANOVA for paired data. (F) Model of RNA sequence-regulated activities of eIF4A1. eIF4A1 adopts RNA sequence-specific conformations upon RNA-binding, allowing either loading complex formation or multimeric complex formation. mRNAs with AG-rich sequences specifically recruit eIF4A1, enabling assembly of the helicase-active multimeric eIF4A1 complex, and positioning these complexes proximal to stable localised RNA structure, allowing ribosomal subunit scanning.
Together, this describes a model in which eIF4A1-dependent mRNAs use AG-rich motifs in their 5'UTR to recruit and specifically activate eIF4A1-unwinding to regulate their translation (Figure 5F).
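To make the notion of an AG-rich 5'UTR motif concrete, the following minimal Python sketch scans a sequence for purine-rich windows. It is an illustration only and not the motif analysis used in this study; the function names, the 10-nt window and the purine-fraction threshold are assumptions chosen for the example.

```python
from typing import List, Tuple


def purine_fraction(seq: str) -> float:
    """Return the fraction of A/G bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "AG" for base in seq) / len(seq)


def purine_rich_windows(
    utr: str,
    window: int = 10,           # assumed window length, not taken from the study
    min_fraction: float = 0.9,  # assumed cut-off, not taken from the study
) -> List[Tuple[int, float]]:
    """Return (0-based start, purine fraction) for every window at or above the cut-off."""
    hits = []
    for start in range(len(utr) - window + 1):
        fraction = purine_fraction(utr[start:start + window])
        if fraction >= min_fraction:
            hits.append((start, fraction))
    return hits


if __name__ == "__main__":
    # Hypothetical 5'UTR fragment with an (AG)5 stretch flanked by pyrimidines.
    example_utr = "CCUCCAGAGAGAGAGCCUCC"
    for start, fraction in purine_rich_windows(example_utr, min_fraction=1.0):
        print(start, example_utr[start:start + 10], fraction)
```

A sliding window is used here because the model concerns local sequence context next to structured regions; overall purine content of the 5'UTR would not capture this.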
Mechanistically, our data show that the specific sequence information within the RNA enhances unwinding by eIF4A1 by promoting eIF4A1-multimerisation through an RNA-centric mechanism. Following RNA sequence-specific binding (Figure 1A), eIF4A1 forms ATPase-active but unwinding-inefficient monomeric complexes (Figure 3G-H) or unwinding-activated multimeric complexes (Figure 3B and I), as directed by RNA sequence (Figure 5F). Activation of unwinding is achieved by a specific division of catalytic capacities between the different eIF4A1-subunits, overhang-bound and unwinding subunits, within multimeric eIF4A1 (Figure 4B and Supplementary Figure S4E), such that overhang-bound eIF4A1 does not directly participate in the unwinding step but stimulates duplex separation by additional eIF4A1-subunits (Figure 4B). Our model of eIF4A1-activation by single-stranded RNA extends a recent model described for yeast eIF4A (27) by emphasising the importance of the exact nucleotide sequence over the length of the single-stranded RNA in the regulation of eIF4A1 function. Additionally, we show that the nucleotide sequence induces eIF4A1-multimerisation, which is initiated by a single eIF4A1 binding to the single-stranded overhang of the substrate (Figure 5C). This complex then undergoes conformational changes that allow recruitment of additional eIF4A1 and thus formation of the multimeric eIF4A1-loading complex (Figure 5A, B). Assembly of this complex changes the conformation of the single-stranded RNA region (Figure 5E) such that eIF4A1 subunits are positioned at the fork of the proximal RNA duplex (Figure 5D and Supplementary Figure S6J). This in turn allows enhanced engagement of eIF4A1-unwinding subunits with the duplex, stimulating strand separation. We hypothesise that eIF4A1-loading subunits transition dynamically into unwinding subunits, which enables recruitment of new loading subunits from free eIF4A1, fuelling the unwinding reaction.
This would be in agreement with a requirement of free eIF4A1 for efficient unwinding (Figure 3G-I). A change in the conformation of the single-stranded RNA region upon helicase binding and a similar multimerisation model have been described previously for the Ded1p/DDX3 family (64,67). However, while Ded1p/DDX3 binds ssRNA regardless of ATP (68), eIF4A1's ssRNA-binding is dependent on simultaneous ATP-binding and thus eIF4A1's ATPase activity (22,55). Interestingly, our results show that ATP-turnover of the loading subunits per se is not essential for subsequent unwinding (Figure 4B and Supplementary Figure S4E). Together, this suggests that the ATPase activity of the eIF4A1-loading complexes controls their kinetic stability and thus activation of unwinding, as opposed to contributing directly to the strand separation reaction itself. This could in part explain the different unwinding activities of eIF4A1 on different overhang sequences. In agreement, silvestrol-clamped eIF4A1 showed increased unwinding activity.
Multimerisation of DEAD-box helicases as a requirement for efficient RNA strand separation has also been reported for the yeast Ded1p and its human homolog DDX3X, the cold-shock activated helicase CshA and the heat-resistant RNA-dependent ATPase Hera (64,69,70,71). These studies present a range of modes by which helicases multimerise: while DDX3X/Ded1p forms multimeric complexes readily in the absence of RNA (64), complex formation of CshA and Hera is mediated by unique dimerization domains (70,71). Our data show that eIF4A1, in contrast, follows a distinct mechanism. eIF4A1-multimer formation is dependent on RNA-binding (Figure 3B and Supplementary Figure S3F-G) and occurs in an RNA sequence-specific manner, with polypurine repeats triggering efficient multimerisation (Figures 1A, 3B, C). Importantly, we observe significant eIF4A1-multimerisation in vitro at protein concentrations lower than 5 μM, which is below the cellular eIF4A1 concentration of 10-20 μM (59) and thus strongly supports multimerisation of eIF4A1 occurring in cells. In agreement with eIF4A1-multimers being active in cells, we (i) visualised direct eIF4A1-eIF4A1 interactions (Figure 3D, E) and (ii) revealed RNA sequence-specific unwinding (Figure 2H) as well as stimulation of translation by eIF4A1 in cells (Figures 2G and I).
In the cellular environment, the majority of eIF4A1 functions are believed to rely on interactions between eIF4A1 and its cofactors, including eIF4G, eIF4B and eIF4H, which collectively stimulate eIF4A1's catalytic capacities (18)(19)(20)(21). It has been shown mechanistically that the different cofactors affect the rates of conformational transitions within eIF4A1, thus guiding eIF4A1 through its catalytic cycle. Our results extend the existing models and show that cofactors also operate efficiently upon eIF4A1-multimers (Supplementary Figure S5B-D). Our data are consistent with a model in which eIF4H and eIF4G operate on distinct eIF4A1-subunits to deliver their function, which allows synergistic activation of multimeric eIF4A1. While eIF4H stabilises the loading complex and stimulates activity of the unwinding subunits, eIF4G functions on or replaces the loading subunits. Additionally, we observe that in the presence of eIF4G the communication between eIF4A1-subunits is strongly reduced (Supplementary Figure S5K), suggesting that eIF4G can replace the eIF4A1-loading subunits. A similar observation has been described for the Ded1p-eIF4G interaction (64). As a consequence, RNA-binding specificities are delivered through eIF4G rather than eIF4A1. In support of our model, (i) eIF4G contains two eIF4A1-binding sites which each induce different catalytic properties of eIF4A1 upon binding (20,72,73), and (ii) silvestrol affected the activity of cofactor-containing eIF4A1-multimers distinctly, i.e. silvestrol inhibited activity of eIF4G-containing multimeric complexes while it stimulated eIF4H-containing eIF4A1-complexes (Supplementary Figure S5L). This suggests that silvestrol stimulates unwinding by clamping and stabilising the loading eIF4A1 subunits. As a result, silvestrol inhibits multimeric eIF4A1 complexes that do not contain eIF4A1 loading subunits, like the eIF4G-containing ones, by clamping and thus inactivating an eIF4A1-unwinding subunit.
This is in agreement with recent reports showing that rocaglamides appear to specifically reduce the unwinding activity of eIF4E-independent (constitutively active) eIF4F variants (74).
Our data specifically suggest that multimeric eIF4A1 is critical for site-specific unwinding of RNA structures to facilitate cap-dependent translation regardless of cofactor activity (Figure 2). Moreover, eIF4A1 and other DEAD-box helicases have recently been shown to be major regulators of RNA condensation, which show helicase-mRNA-specific networks and can regulate translation (60,75). In these studies, eIF4A1 was found to resolve RNA condensates in an unwinding-dependent manner. Our model suggests that eIF4A1 would operate on RNA condensates differentially depending on RNA concentration and RNA sequence composition, preferentially resolving those RNA condensates that allow eIF4A1 multimerisation to occur. However, as eIF4A1-cofactors change the specific activities of multimeric eIF4A1, we hypothesise that, in addition to cofactor activity, a variety of RNA sequences might coordinate eIF4A1 function to drive different translational programmes through recruitment and assembly of distinct multimeric eIF4A1-complexes. This concept might also explain the different 5'UTR features that have been described for eIF4A1-dependent mRNAs. Depending on the approach, the networks between the different multimeric eIF4A1-cofactor complexes might be differentially affected, highlighting different but specific groups of eIF4A1-dependent mRNAs. Further, as the active concentration of translation initiation factors including eIF4F (i) is tightly controlled in cellular programmes like proliferation and differentiation (76)(77)(78), (ii) can vary between tissues and (iii) is often dramatically affected in many different cancers (59,79,80,81), regulation and dysregulation of eIF4A1 multimer formation is likely to have a strong impact on the translational landscape of the cell.
Given the strong evolutionary conservation of the DEAD-box helicase core, it is likely that comparable mechanisms of RNA-based activation of unwinding, and hence regulation of helicase function, are found across this protein family (64,67).
SAXS data were submitted to the Small Angle Scattering Biological Data Bank and are accessible via https://www.sasbdb.org/project/1902/v0yo6ull0o. Models are available in the supplementary information.
The TMT-pulsed SILAC mass spectrometry data have been deposited to the ProteomeXchange Consortium via the PRIDE (82) partner repository with the dataset identifier PXD034343.
[MEDPG21F\5 to S. Z.]. Funding for open access charge: Cancer Research UK Core Funding. Conflict of interest statement. M.B.'s lab collaborates with Cancer Research UK's Therapeutic Discovery Laboratories on drug discovery against some of the targets in this paper.